Constrained future brightening of solar radiation and its implication for China's solar power

Abstract As Earth's primary energy source, surface downward solar radiation (Rs) determines the solar power potential and usage for climate change mitigation. Future projections of Rs based on climate models have large uncertainties that interfere with the efficient deployment of solar energy to achieve China's carbon-neutrality goal. Here we assess 24 models in the latest Coupled Model Intercomparison Project Phase 6 with historical observations in China and find systematic biases in simulating historical Rs values likely due to model biases in cloud cover and clear-sky radiation, resulting in largely uncertain projections for future changes in Rs. Based on emergent constraints, we obtain credible Rs with narrowed uncertainties by ∼56% in the mid-twenty-first century and show that the mean Rs change during 2050–2069 relative to 1995–2014 is 30% more brightening than the raw projections. Particularly in North China and Southeast China with higher power demand, the constrained projections present more significant brightening, highlighting the importance of considering the spatial changes in future Rs when locating new solar energy infrastructures.


INTRODUCTION
To keep global warming to <2 • C by the end of this century, renewable energy resources, such as solar energy generated by solar photovoltaic (PV) systems, have become increasingly important in the total primary energy supply [1][2][3]. Surface downward solar radiation (R s , with a wavelength of 0.2-4.0 μm) is the most critical indicator to know about the PV potential. To meet the carbon-neutrality goal, China has invested heavily in PV systems in the last decade. By the end of 2021, the installed capacity of the PV systems nationwide reached 306 GW, ranking first in the world for six consecutive years since 2015. Moreover, China plans 3550 GW of installed PV capacity by 2060, accounting for 45% of the total power generation in China [4]. Thus, robust estimates of future trends and uncertainties in solar radiation are crucial for realizing this promise of energy structure optimization.
Historical ground observations reveal that R s declined from the 1950s to the 1980s and increased thereafter, known as global dimming and brightening [5]. In China, observational data show significant decadal variations in R s in China with a dimming during the 1960s to 1990s and flattening thereafter ( Supplementary Fig. S1a) [5][6][7][8][9]. Prior studies showed that the models that participated in the Coupled Model Intercomparison Project Phase 6 (CMIP6) overestimate the global mean R s and underestimate R s trend in China [10,11], but the underlying drivers of the R s bias in CMIP6 models remain to be explored further. The CMIP6 experiments offer a unique opportunity for R s assessment and projections under various emission scenarios that reflect policy impacts and socio-economic risks [12,13]. However, uncertainties in climate projections can arise from internal climate variability, model uncertainty and scenario uncertainty [14][15][16][17]. The parametric and structural model uncertainty associated with the model's response to specified forcing agents explains a major part of uncertainty in mid-to long-term climate projection [18].
To improve the reliability of climate projection, emergent constraints (EC) provide a novel approach with a solid physical basis for narrowing the uncertainties of future climate projections through the combination of an ensemble of climate simulations with contemporary measurements [19,20]. While EC have served to narrow down the uncertainty in equilibrium climate sensitivity, atmospheric circulation pattern and air temperature projection, among others [19][20][21][22][23][24][25][26], how to constrain the R s projection remains to be explored through looking for a robust constraint relationship of R s .
In this paper, we assess the performance of R s simulations from 24 CMIP6 climate models and explore whether their R s biases over China are triggered by model biases in simulating total cloud cover and aerosol radiative effects. Then, we use historical biases of models to constrain the future projections of R s under three possible future scenarios with different shared socio-economic pathways (SSPs) (SSP1-2.6, SSP2-4.5 and SSP5-8.5) based on EC. Details of our analytical approach are in the 'Methods' section. Our results can provide the best estimate and confidence interval of the R s projection over China in the mid-twenty-first century and have important implications for the investment and construction of PV systems in China.

Model bias in simulating historical R s
The multi-model mean (MMM) R s of the CMIP6 historical all-forcing simulations presents a similar spatial pattern with the observations and the spatial correlation coefficient between the simulations and the observations over 2 • × 2 • grids is 0.93 (P < 0.05; Fig. 1a and b). Peaks of R s mainly concentrate over the Tibetan Plateau with ∼250 W·m −2 (Fig. 1a), due to low air mass, short cloud liquid water pathways and low anthropogenic aerosol concentrations. Lower values of R s mainly distribute in the east of China with the lowest value of ∼150 W·m −2 (Fig. 1a). The substantial reflection of clouds and the scattering and absorption of anthropogenic aerosols are responsible for low R s over these regions [27,28].
However, the CMIP6 MMM R s in the historical simulations is apparently overestimated at most grids (Fig. 1b), particularly in Northwest China and Southwest China (Fig. 1c). Compared with the observations, the CMIP6 MMM overestimates R s by 11.39 ± 7.20 W·m −2 (mean ± 1 standard deviation) averaged in China from 1961 to 2014 ( Fig. 1c and Table 1). The performance among CMIP6 models differs greatly, with the inter-model range of the R s biases averaged in China over 30 W·m −2 (Table 1), which is larger than those averaged over the globe [11]. Most models exhibit a consistent pattern of overestimation with a spatial correlation of ∼0.7 against the CMIP6 MMM, and only 2 out of 24 models present negative biases in R s ( Table 1).

Drivers of model bias in historical R s
Clouds and aerosols have been widely regarded as the main controlling factors for changes in R s [29,30]. The former can regulate R s by reflecting solar radiation [31] and the latter also play a critical role in R s changes due to their scattering and absorption of insolation, especially in regions with severe air pollution [32]. Although models have made significant progress in key physical processes of climate change, the biases in simulating these key processes may still be the major contributors to the biases of R s . The overall pattern of the total cloud cover fraction (TCC) can be simulated with a spatial correlation of 0.81 (P < 0.05) against the observed TCC in China (Supplementary Fig. S2a and b) but with evident spatial heterogeneities. For instance, models underestimate TCC in North China and Southeast China but overestimate in Northeast China, Northwest China and the Tibetan Plateau (Fig. 1d). The spatial pattern of the TCC bias is rather robust among the 24 models and the average value of the spatial correlation with the CMIP6 MMM is ∼0.85 ( Table 1). The spatial pattern of the TCC bias in the CMIP6 MMM matches well with that of the R s bias ( Fig. 1c and d), with a spatial correlation of -0.51 (P < 0.05).
The historical simulations of clear-sky surface downward solar radiation (R s-clear ) are also evaluated to identify the impact of aerosols on the simulated R s . First, the CMIP6 MMM R s-clear exhibits a similar pattern as observations with a spatial correlation of 0.94 (P < 0.05; Supplementary Fig. S2c and d) but it is positively biased in China, especially in Northeast China and the Tibetan Plateau (Fig. 1e). This pattern of the R s-clear bias is robust and consistent among the 24 models with a spatial correlation of ∼0.80 on average against the CMIP6 MMM (Table 1). Second, the simulated R s-clear shows an excessive decline from 1961 to 2014, while both ground-based observations and satellite measurements show an increase after 2008 ( Supplementary Fig. S1c). This is probably because the CMIP6 models significantly overload the decadal changes in anthropogenic aerosols in China [33]. Figure 2 shows the sensitivities of model bias in R s to TCC and R s-clear biases, where the R s bias has a larger partial correlation coefficient for the TCC bias in the south and the R s-clear bias in the north. Overall, the bias of R s in the CMIP6 MMM at the spatio-temporal scale can be explained by the combined effects of the simulated biases in TCC and R s-clear . In North China and Southeast China, the underestimation of TCC with high sensitivity to R s and the overestimation of R s-clear with low sensitivity to R s jointly result in the positive bias of R s (Figs 1ce and 2). In Northeast China, Northwest China and the Tibetan Plateau, the positive bias of R s is dominated by the overestimated R s-clear that is highly sensitive to R s , completely offsetting the overestimated TCC that is slightly sensitive to R s (Figs 1c-e and 2).
The inter-model relationships of the R s bias against the TCC and R s-clear biases can also offer additional insights into the possible causes of the R s bias (Fig. 3). The R s bias clearly demonstrates a strong inter-model correlation with the TCC bias (r = -0.67, P < 0.01), with a larger underestimation of TCC simulation responding to a larger overestimation of R s simulation (Fig. 3a), and this relationship is also robust over the grids (Supplementary Fig. S3a). This significant relationship between the R s and TCC biases suggests that the inter-model differences may have the same underlying physical Table 1. Overview of the CMIP6 models used in this study, their horizontal grids, multi-year mean biases (MB) of surface downward solar radiation (R s , in W·m −2 ), total cloud cover fraction (TCC, in %) and clear-sky surface downward solar radiation (R s-clear , in W·m −2 ) referenced to station observations averaged over China from 1961 to 2014, spatial correlation coefficient (r) of the biases between individual model simulation and the multi-model mean (MMM) of the CMIP6 historical all-forcing simulations over the grids with observations. The rightmost column is the model weight (w i , in %) estimated for the weighted emergent constraint method (weight-EC). drivers as the individual model bias has [22].
Although the inter-model relationship between the R s and R s-clear biases is weak on the national scale ( Fig. 3b), such significantly positive correlations can be seen at most grids in Northwest China and Southeast China (Supplementary Fig. S3b). Most models overestimate R s-clear to enhance the positive bias in the simulated R s or partly alleviate the effect of the TCC overestimation (Figs 3 and 1c-e).

Impact of model bias on the R s projections
The national mean R s simulated in three future SSP scenarios (SSP1-2.6, SSP2-4.5 and SSP5-8.   (weight-EC) and the other is based on regression (regression-EC). Considering the different advantages of the regression-EC and weight-EC methods, we average their constrained projections of R s (Fig. 4d-f) for better supporting decision-making in the future. Compared with the raw projections, the constrained projections in 2050-2069 are reduced in the national mean R s by ∼5%, i.e. from 196 to 185 W·m −2 in SSP1-2.6, 192 to 182 W·m −2 in SSP2-4.5 and 191 to 181 W·m −2 in SSP5-8.5 (Fig. 4d-f). More importantly, the projection uncertainties of R s are reduced by ∼56%, which significantly improves inter-model agreement in R s and substantially increases our confidence in the future projections of R s (Fig. 4d-f). Moreover, we find that the constraints using the combined effect of the TCC and R s-clear biases can account for ∼81% of the projection uncertainties using R s (Fig. 4d-f). The constraints are also applied to the grid scale and yield similar results as above.
Credible projection of R s values obtained by the EC with the help of historical observations can yield more realistic estimates for future changes in R s . Figure 5 shows the spatial patterns of future changes in R s calculated as the differences of the constrained simulations between 2050-2069 and 1995-2014. Compared with the raw future change, the constrained result of the CMIP6 MMM R s in SSP1-2.6 shows a higher level of brightening during 2050-2069 relative to the 1995-2014 mean in North China and Southeast China with higher power demand (Fig. 5a and d). The constrained change increases by ∼30% in China referenced to the raw projection in SSP1-2.6 ( Fig. 5a and d). With increased anthropogenic forcing in SSP2-4.5 and SSP5-8.5, the constrained future changes become weaker brightening in eastern China and more dimming in western China (Fig. 5b, c, e and f). This could imply that low anthropogenic emissions under the carbon-neutrality actions would increase future changes in R s and solar energy potential, consequently creating positive feedback for building a climate-resilient society. More importantly, the uncertainties of future changes in R s are reduced by 70% in North China and Southeast China and by 30% for the rest of China.

DISCUSSION
Effectiveness and confidence in the layout of solar PV systems heavily depend on reliable projections of R s . Considering model bias and scenario uncertainty, the future projections of R s in the mid-twentyfirst century by the CMIP6 models in three possible future emission scenarios are constrained with the help of historical observations in this study. Results show that future increase in the constrained R s is largest in SSP1-2.6, a low-emission scenario, followed by SSP2-4.5 and SSP5-8.5, median-and highemission scenarios (Fig. 5). According to the linear relationship between R s and electrical power by solar (see 'Methods' section) [34], the potential electrical power generated by solar in China at SSP1-2.6, SSP2-4.5 and SSP5-8.5 is estimated to be 44.3, 43.6 and 43.3 TW per year on average during 2050-2069, respectively. These results are more favorable for increasing the share of solar energy in the future for phasing out heavily polluting coal as its major energy source, which is conducive to the formation of positive feedback in the producing clean energytackling climate change under the carbon-neutrality actions.
In China, there are substantial regional variations in solar power generation potential affected by shortwave radiation, land availability and installation densities, showing a downward trend from northwest to southeast [35,36]. However, most western regions including Xinjiang, Inner Mongolia, Gansu, Qinghai and Tibet with huge solar PV generation potential have relatively low electricity demand and population density [35,37], so it needs huge costs to realize power transmission from west to east. Taking into account the cost of spatial dislocations between the PV power generation potential and electricity demand in China, and the need to improve air quality by reducing coal use in eastern China, the development of PV systems has recently begun to shift from west to east [37]. The higher level of brightening in the future in North China and Southeast China revealed by our constrained projections of R s in the low-emission scenario (Fig. 5) directly supports the west-to-east shift of the PV systems, which can make better use of future solar energy resources.
Limitations of our analysis are revealed here for better understanding our results. First, sunshine duration data are recorded by visually reading the burned signals on light-sensitive paper and TCC data are observed based on human eye, so the shift work of different meteorological observers may cause problems on the data homogeneity [38,39]. Our homogenization has removed most large discontinuities contained in these data, which can minimize the impact of the observational uncertainty. In addition, sparse ground-based observations over complex terrain in the Tibetan Plateau may also lack spatial representation compared to the rest of China [6]. Second, our methods work well to obtain the best estimates and their uncertainty of the national average R s projections by building a robust constraint relationship using model and observational data, but are not able to constrain R s on those grids without observational data, such as the western Tibetan Plateau (Fig. 5). Third, uncertainties in future projections may suffer from some limitations of different constraint methods. The weight-EC method presents a limitation in the projected R s because it weights the models by the posterior probabilities of their historical simulations against the observations and consequently 4 out of 24 models have large weights (Table 1). In contrast, the regression-EC method treats each model as equal and independent, and leverages the robust linear relationship to constrain the projections of R s . Its result may be disturbed by the models that are significantly inconsistent with the observations or that might share some interdependent modules despite being modified among the models [40], while the weight-EC could overcome these disadvantages [26]. On this basis, our result shown above is averaged from their R s projections constrained by the two methods, considering their different advantages and inherent systematic bias of climate models over China, so we believe that the constrained projections are more credible than the raw projections. Finally, while the relationship between R s and electrical power by solar is actually complex [41], a simple linear model is able to describe their relationship [34]. In summary, the significant systematic bias in R s in the CMIP6 models is identified in this study and its drivers are further revealed to be model biases in simulating TCC and R s-clear . These biases have significant impacts on future projections of R s and their uncertainties from these models. Observationbased EC through this robust relationship largely reduce the projection uncertainties of future R s by ∼56% and increase brightening by ∼30% in the future R s changes during 2050-2069 in China. The constrained projections of R s show a higher level of future brightening in North China and Southeast China, highlighting the need to consider the spatial changes in future R s when making policies or decisions associated with future solar energy deployment.

Observation and model data
Direct measurements of surface downward solar radiation (R s ) are conducted only at ∼100 stations in China. These direct R s measurements not only are unevenly located, but also suffer from series inhomogeneity due to instrument aging and instrument sensitivity drift problems [42][43][44], so that they are not able to well depict historical R s and its change in China. Unlike direct R s measurements, sunshine duration (SunDu) was measured at ∼2200 stations from 1961 to 2014. SunDu has been used to derive R s (SunDu-derived R s ; Equations (1) and (2)) [6,7] based on the revisedÅngström-Prescott equations [45]. The SunDu-derived R s avoids series inhomogeneities not only due to instrument aging and instrument sensitivity drift problems contained in direct R s measurements, but also due to largescale instrument replacements in 1990-1993 across China [42,43,45]. The SunDu-derived R s can well reproduce monthly R s values in China with a bias of 2.19 W·m −2 (1.4%) and a standard deviation of 19.32 W·m −2 (12.0%) compared with direct R s measurements and is able to describe monthly to decadal R s variability well [7,42], which has been used as reference data to assess the performance of R s in the reanalysis products and climate simulations [27,38,46]. Here, the SunDu-derived R s data set at ∼2200 stations in He et al. [7] is used as observations for comparisons with the CMIP6 historical allforcing simulations: where a 0 , a 1 and a 2 represent the regression coefficients of the sunshine duration against the R s observation; n and N represent the observed sunshine duration and theoretical sunshine duration, respectively; τ c is the radiative transmittance due to cloud extinction; I 0 is the solar irradiance at the top of the atmosphere; t is the time (in seconds); τ c dir and τ c dif denote the direct radiation transmittance and the diffuse radiation transmittance under clear skies, respectively, which are calculated through a broad radiative transfer model based on meteorological observations including near-surface air temperature, air pressure and relative humidity and the turbidity coefficient [45]. TCC measured at ∼2200 stations by the China Meteorological Administration (CMA, http://data.cma.cn/en) are used in this study. Noted is that TCC is observed based on the human eye and the number of stations with TCC observations decreased to 800 in 2014. To supplement TCC data for those stations without TCC observations in 2014, we apply inverse distance weighting interpolation to the observations at nearby stations that have a correlation coefficient of >0.7 with the anomaly of the candidate station during 1961-2013 [38,47].
R s and TCC data in the CMIP6 experiments (https://esgf-node.llnl.gov/search/cmip6/) are used in this study, including historical all-forcing simulations (HIST; 1961-2014) and future projections in three possible scenarios with different shared socio-economic pathways (SSP1-2.6, SSP2-4.5 and SSP5-8.5; 2015-2099). To ensure an equal weight for different CMIP6 models, the 'rlilp1f1' realizations are adopted in this study. There are 30 models providing both historical and future data, 24 out of which are selected by comparing the R s anomalies in the HIST simulations with those of observations via the Kolmogorov-Smirnov test at a significance level of 0.05. This method is often used to filter out some models with apparently inappropriate climate simulations [48]. Information of the models used are listed in Table 1.
We describe the clear-sky surface downward solar radiation (R s-clear ) by using the days with TCC of <15% as the clear-sky data [47]. It has been shown that there is little difference in the monthly R s climatology under different clear-sky thresholds (e.g. 10% and 15%) with the one under true cloud-free conditions (0%). To reduce the sampling effect of uneven observation sites and ensure spatial consistency between model grids and observation sites, we integrate all the data onto 2 • × 2 • grid boxes by aver-aging observations of all the sites within a grid box or bilinearly interpolating model grids. We calculate the national average by weighting the area of each grid.

Statistical analyses
Mean bias (MB), root-mean-square error (RMSE) and Pearson correlation coefficient (r) are used to quantify historical bias in R s , TCC and R s-clear of the CMIP6 historical all-forcing simulations against the observations: where m is the number of the R s , TCC or R s-clear data; S i and O i denote the i th data of the CMIP6 simulations and observations, respectively;S andŌ are the mean of the simulations and observations, respectively.
To quantify the sensitivities of R s bias to the TCC or R s-clear bias, we use partial least squares to statistically exclude the confounding effects of the other variable: ρ (x,y )|z = r x,y − r x,z r y ,z where ρ (x,y )|z is the partial correlation coefficient between x and y after controlling z; x represents the R s bias; y or z can be the TCC or R s-clear bias; r is the Pearson correlation coefficient.

EC
Two emergent constraints methods are used to constrain future projection: one is based on posterior probability weight (weight-EC) and the other is based on regression (regression-EC).

Weight-EC
This method is to estimate a posterior probability through an information-theoretic perspective according to how well the models reproduce the observations [26]. The estimated posterior probability is used for weighting the future projections given the climate models to yield robust constraints assuming a robust linear relationship with historical data. To this end, we first calculate the Kullback-Leibler divergence, also known as relative entropy or information divergence, to measure the asymmetry of the probability distributions of R s between the historical simulation of each model and the observation (Equation (7)) [49]. Then, we estimate the weight (w) by an information-theoretic distance measure (l) between each model and the observation (Equations (8) and (9)), which is also verified by visual comparison between them. The weights for all the models are listed in Table 1. We also identify the robust linear relationship between future projections of R s and historical R s bias (Fig. 4a-c) and, based on this relationship, we can obtain the probability density functions (PDF) of the constrained R s by applying the estimated posterior probability and then use a Gaussian kernel to estimate the mean and uncertainty of the projected R s : where O(x) is the PDF of the observation; s j (x) is the PDF of the historical simulation in the j th model; j is the Kullback-Leibler divergence of each model; l j is the information-theoretic distance describing how well the historical simulation reproduces the observation, which is quantified here as the likelihood of the j th model given the observed distribution; w j is the normalized weight of the j th model.

Regression-EC
This method is to leverage the robust inter-model relationship (Equation (11)) to constrain the future projection of R s with the help of the observation. The parametric uncertainty of the regression model and the observation uncertainty are considered. To account for the parametric uncertainty, we repeat the inter-model regression between future projections of R s and model biases of R s via 1000times bootstraps. To estimate the uncertainty of the observation average, we estimate the PDF of the average via 1000-times bootstraps of annual R s observations during 1961-2014. The observation uncertainty contained in the R s bias can be quantified by the PDF of the observation average: where Y is the future projection of R s from CMIP6 models in each of three possible future scenarios and X is the historical model bias in R s ; α is the regression slope and β is the intercept. The value of Y as X equals 0 is the constrained projection.
To constrain future projections of R s , we take the bias of 0 and its PDF as input data into each bootstrap regression equation (Equation (11)) to generate a pool of future constrained R s . As the weight-EC method, we also employ a Gaussian kernel with a bandwidth chosen to minimize the mean integrated squared error of the pool of future constrained R s , to estimate the mean and uncertainty of the projected R s . Previous studies [22,50] usually ignore either of them in their EC constraints, thereby weakening the estimated uncertainty.
To verify the validity of the constraint methods, we constrain the recent 20-year mean historical R s simulations of the CMIP6 models averaged over the grids with the observations in China during 1995-2014 based on the former period of 1961-1994 and compare the constrained results with the 1995-2014 observations. The comparison shows they match well and the uncertainties are significantly reduced ( Supplementary Fig. S5).

Estimate of the combined effect of TCC and R s-clear
To estimate the combined effect of TCC and R s-clear on R s in the observations from 1961 to 2014, a multiple linear regression is applied: where b 1 and b 2 are the regression coefficients; b 0 is the constant and ε is the residual. Then, we repeat the constraints using the combined effect of TCC and R s-clear , instead of R s , based on the weight-EC and regression-EC methods.

Estimate of electrical power generated by solar
A model [34] to estimate electrical power (EP) generated by solar is used to show its linear relationship with R s , as the following: where S represents the annual average R s in China during 2050-2069; η is the average conversion efficiency set at 26.9% in 2060 where air temperature plays a role [37]; α p is the panel albedo assumed to be 0.1 [34]; A p represents the suitable land area for PV power generation, which is set at